Project 2

Hypothesis: Across the United States, mortality rates associated with cold weather are higher than mortality rates associated with hot weather.

By Tim and Ted

Exploratory Visuals in QGIS

Figures: Crude Total Heat Rate by State, 1999-2020; Crude Total Cold Rate by State, 1999-2020; Temperature Average by State, 1999-2020

These first visuals were just meant to give us a sense of what the data looks like and how we can manipulate or use it.

Merge cold deaths and heat deaths with state boundaries (initial exploration)

Cold-related deaths were identified using the following ICD-10 codes:

Heat-related deaths were identified using the following ICD-10 codes:

These were preprocessed/exported using the CDC WONDER tool.

First look at the data

Here is our first look at the data. So far we have formatted the data by renaming and dropping certain columns, merged the hot and cold death tables on the state code (FIPS), and recalculated the death rates for each state (because some of the rates were labeled "unreliable"). We then merged the table with a state shapefile to add the geometry for each state and plotted it.
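The merge-and-recalculate step described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the notebook's actual code: the field names `deaths` and `population`, the helper names, and every number are invented placeholders.

```python
# Hypothetical per-state records keyed by FIPS code; all numbers are invented.
cold = {"04": {"deaths": 300, "population": 6_000_000},
        "27": {"deaths": 900, "population": 5_000_000}}
heat = {"04": {"deaths": 1500, "population": 6_000_000},
        "27": {"deaths": 50, "population": 5_000_000}}


def crude_rate(deaths, population, per=100_000):
    """Crude death rate: deaths per `per` people."""
    return deaths / population * per


def merge_by_fips(cold, heat):
    """Inner-join the two tables on FIPS and recompute both crude rates."""
    merged = {}
    for fips in cold.keys() & heat.keys():
        merged[fips] = {
            "cold_rate": crude_rate(cold[fips]["deaths"], cold[fips]["population"]),
            "heat_rate": crude_rate(heat[fips]["deaths"], heat[fips]["population"]),
        }
    return merged


rates = merge_by_fips(cold, heat)
```

Recomputing the rate from the raw counts sidesteps the "unreliable" labels, since every state then gets a number from the same formula.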

This first view is okay, but it is quite zoomed out and the details of the plots are hard to see. We will have to either zoom in or drop the geometry on the right of the plot so that it zooms in automatically. We have also not pulled in the weather data yet. One idea is a supplemental scatterplot of the number of deaths or death rate against the number of days below freezing or above 90 degrees. We will also likely drop the total death choropleths because, unlike death rate, total deaths do not account for population. Another approach is looking at rates at a county level. However, this will likely result in too many suppressed values.

Note: We simplified the geometry for the counties because otherwise the kernel would shut down.

2nd Revision

This is interesting to look at. However, the county rates are not entirely accurate since we replaced suppressed values with a 1. We did this because the CDC suppresses any row with fewer than 10 deaths, so we know at least one person died due to a temperature-related cause, but we are not sure how many.

For this reason we will likely have to stick with a "by state" analysis even though the county rate plots are much more detailed. We did similar processing to the first revision, but this time at the county level. This included formatting the tables again and merging the heat and cold deaths to a county shapefile by COUNTYFP and STATEFP. We also calculated the rates after replacing the suppressed values with 1, which again is why the map is not entirely accurate.

We then plotted all 4 maps to compare how they look, this time zoomed in on the 50 states and ignoring the overseas territories that were in our shapefile. Gray counties denote 0 temperature-related deaths. One thing we could do for the county plot is keep track of which values were originally suppressed and apply a pattern across those counties to see how many of the rates are just estimates. We still need to tie in the number of days above or below a certain temperature threshold.
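One way to keep track of which county values were originally suppressed is to carry a flag alongside the placeholder count. The sketch below assumes suppressed cells arrive as the string "Suppressed"; the rows, the field names, and the `clean_count` helper are hypothetical.

```python
def clean_count(raw):
    """Return (count, was_suppressed) for one raw deaths cell.

    Assumes suppressed cells hold the string "Suppressed"; they are replaced
    with 1, the placeholder count described above.
    """
    if str(raw).strip().lower() == "suppressed":
        return 1, True
    return int(raw), False


# Hypothetical county rows: (county FIPS, raw deaths cell, population).
rows = [
    ("04013", "120", 4_000_000),
    ("27001", "Suppressed", 16_000),
    ("27003", "0", 350_000),
]

counties = []
for fips, raw, population in rows:
    deaths, suppressed = clean_count(raw)
    counties.append({
        "fips": fips,
        "rate": deaths / population * 100_000,  # deaths per 100,000
        "estimated": suppressed,                # True where the rate is a placeholder
    })

# Share of counties whose rate rests on the placeholder count.
share_estimated = sum(c["estimated"] for c in counties) / len(counties)
```

The `estimated` flag is what a hatch pattern on the county map could be keyed to.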

Note: This is more of an exploratory visual to see how many county death rates are suppressed (black). Most of them are, which is why we decided to stick to a state-level analysis to preserve data integrity.

3rd Revision

These plots look a lot better. This choropleth gives us an idea of which states have a higher death rate from cold-related causes than from heat-related causes.

To get this value we took the difference between the cold death rate and the heat death rate. Values that are positive show a higher death rate from cold, and values that are negative show a higher death rate from heat.
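As a small illustration of the sign convention, the difference and its interpretation can be written as follows. The state names and rates are invented, and `classify` is a hypothetical helper rather than anything from the notebook.

```python
def classify(cold_rate, heat_rate):
    """Return (cold minus heat, label naming the larger rate)."""
    diff = cold_rate - heat_rate
    if diff > 0:
        label = "cold"
    elif diff < 0:
        label = "heat"
    else:
        label = "equal"
    return diff, label


# Invented rates per 100,000 as (cold, heat) pairs.
example_rates = {"State A": (2.4, 0.3), "State B": (0.5, 1.7)}
differences = {state: classify(c, h) for state, (c, h) in example_rates.items()}
```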

There certainly seem to be more blue states, which means that in most states the death rate per 100,000 people is higher for cold than for heat. The bar graphs are mostly supplemental, but they show heat-related versus cold-related mortality rates for each state and the number of days where the minimum temperature in the state was below 0°C (32°F) or where the maximum temperature was above 32°C (89.6°F).

From the first bar graph we can see that a few states, such as Arizona and Texas, have much higher mortality rates due to heat, but the cold-related mortality rates seem to be more consistent across the states.

From the second bar graph we can see that most of the states have more days below freezing than days above 32°C.

Critique: We still need to add a border to the map and a cartographer's block. It may be worthwhile to do an inset for Alaska and possibly Hawaii to allow us to zoom in on the map even more. It may also be interesting to look at poverty values for each of the states and see if there is a relationship between the number of deaths and the poverty rate.

4th Revision

We have now tied in poverty to a choropleth, but we are not sure this shows us much. It might look better as a scatter plot against cold deaths and heat deaths, to see if there is a relationship or not. This data was given in an Excel file with different tables for each year, so we had to concatenate multiple tables from each sheet into one DataFrame and then filter that for the years that we wanted. We then summed the population and people-in-poverty counts for each state and merged this table with our GeoDataFrame from earlier to get all our data into one GeoDataFrame.
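The concatenate, filter, and sum steps can be sketched without pandas as below. The `sheets` mapping stands in for the per-year Excel tables, and its layout, the `poverty_percent` helper, and all of the numbers are assumptions made for illustration.

```python
# Hypothetical stand-in for the per-year sheets:
# year -> list of (state, population, people in poverty). Numbers are invented.
sheets = {
    1998: [("Ohio", 11_000_000, 1_200_000)],
    1999: [("Ohio", 11_200_000, 1_300_000), ("Utah", 2_100_000, 200_000)],
    2000: [("Ohio", 11_300_000, 1_100_000), ("Utah", 2_200_000, 230_000)],
}


def poverty_percent(sheets, first_year, last_year):
    """Sum population and poverty counts per state over the chosen years,
    then return the pooled poverty percentage for each state."""
    totals = {}
    for year, rows in sheets.items():
        if not first_year <= year <= last_year:
            continue  # filter out years outside the study period
        for state, population, in_poverty in rows:
            pop_sum, pov_sum = totals.get(state, (0, 0))
            totals[state] = (pop_sum + population, pov_sum + in_poverty)
    return {state: pov / pop * 100 for state, (pop, pov) in totals.items()}


poverty = poverty_percent(sheets, 1999, 2000)
```

Dividing the summed poverty count by the summed population gives a population-weighted percentage, which differs slightly from averaging the yearly percentages.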

At this point we have done most of the calculations we can. For the final revision, we think it would be best to have (1) the choropleth with differences in death rates, (2) a bar chart comparing cold days versus hot days, (3) another bar chart comparing cold-related and heat-related mortality rates, and (4) a scatter plot of the poverty percentage vs total deaths for each state.

Critique: We realized that we have not transformed the CRS. We will convert it to EPSG:5070 to keep the states in reasonable proportions. We will fine-tune each of these plots for the final visuals.

Interesting Insight: There seems to be a positive correlation between poverty percent and heat death rates but a negative correlation between cold death rates and poverty percent. Maybe people who live in poverty don't have as much access to air conditioning, but are able to access heat?

Moran's I Analysis: We ran Moran's I analysis on the death rate difference (cold minus heat), the cold death rate, and the heat death rate to check whether neighboring states have similar values. Moran's I was positive for all three, which tells us that states with similar values tend to be clustered near each other. Also, the p-value was less than 0.05 in each case, which means the clustering is unlikely to be due to chance.
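For readers unfamiliar with the statistic, here is a hand-rolled sketch of global Moran's I with a permutation-based pseudo p-value. It is not the tool used for the analysis above, and it assumes binary contiguity weights (1 for neighbors, 0 otherwise), which is a choice made here for illustration; the six-unit chain and its values are toy data.

```python
import random


def morans_i(values, neighbors):
    """Global Moran's I with binary weights.

    values:    dict of unit -> numeric value
    neighbors: dict of unit -> set of neighboring units (symmetric)
    """
    n = len(values)
    mean = sum(values.values()) / n
    dev = {unit: v - mean for unit, v in values.items()}
    w_total = sum(len(nbrs) for nbrs in neighbors.values())
    cross = sum(dev[i] * dev[j] for i, nbrs in neighbors.items() for j in nbrs)
    denom = sum(d * d for d in dev.values())
    return (n / w_total) * cross / denom


def permutation_p_value(values, neighbors, n_perm=999, seed=0):
    """One-sided pseudo p-value: share of random relabelings whose
    Moran's I is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = morans_i(values, neighbors)
    units, shuffled = list(values), list(values.values())
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if morans_i(dict(zip(units, shuffled)), neighbors) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy chain of six units A-B-C-D-E-F, each touching the next.
chain = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"},
         "D": {"C", "E"}, "E": {"D", "F"}, "F": {"E"}}
smooth = dict(zip("ABCDEF", [1, 2, 3, 4, 5, 6]))       # similar values adjacent
alternating = dict(zip("ABCDEF", [1, 6, 1, 6, 1, 6]))  # dissimilar values adjacent

i_smooth = morans_i(smooth, chain)
i_alternating = morans_i(alternating, chain)
p_smooth = permutation_p_value(smooth, chain)
```

In the toy data the smoothly increasing chain gives a positive value and the alternating chain a negative one, matching the reading that positive Moran's I means similar values sit next to each other.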

Final Visuals (before prototype/development)

Hypothesis Restated:

Across the United States, mortality rates associated with cold weather are higher than mortality rates associated with hot weather.

Comments from Jesse & the class

After reviewing our visuals with the class, there were a few things noted that we could address.

We converted the rates to an average over the 21 years for better units and converted the CRS.

Analysis: This map is a choropleth of the age group with the highest deaths for each state. To get this map we found the age group with the most deaths in each state from 1999-2020 and merged the results into a GeoDataFrame to add geometry for the values. This gives us an idea of which age group is most at risk for weather-related deaths. States colored in black have no data because the values for the age groups were suppressed or 0. It would be nice to scale down Alaska, scale up Hawaii, and then inset them for each of the maps.
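Picking the age group with the most deaths per state is an argmax over each state's counts. In the sketch below, `None` stands for a suppressed count, and the states, age bands, and numbers are purely illustrative.

```python
# Invented counts per ten-year age group; None marks a suppressed value.
deaths_by_age = {
    "Arizona": {"45-54": 310, "55-64": 420, "65-74": 390},
    "Alaska": {"45-54": 95, "55-64": 80, "65-74": 0},
    "Vermont": {"45-54": None, "55-64": None},
}


def top_age_group(groups):
    """Return the age group with the most deaths, or None when every
    count is suppressed or zero (the 'no data' case on the map)."""
    known = {age: n for age, n in groups.items() if n}  # drops None and 0
    if not known:
        return None
    return max(known, key=known.get)


top = {state: top_age_group(groups) for state, groups in deaths_by_age.items()}
```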

Introduction

Extreme temperatures pose significant risks to people across the United States. Both cold and heat can lead to death, but these mortality rates are not equal. Which type of extreme temperature causes more deaths? Do socioeconomic conditions play a role? Are certain age groups more affected than others?

This project aims to explore these questions by analyzing state level mortality data from 1999–2020, comparing deaths related to cold and heat across demographic and environmental factors.

Hypothesis:

Across the United States, mortality rates associated with cold weather are higher than mortality rates associated with hot weather.

Data Explanation

We obtained mortality data from CDC WONDER, querying the Multiple Cause of Death (1999–2020) database. This includes cases where exposure to extreme temperatures was either the underlying cause or a contributing factor of death.

Cold-related deaths were identified using the following ICD-10 codes:

Heat-related deaths were identified using:

For each state, we calculated the average annual mortality rate from 1999–2020 as:

Average Mortality Rate = ((Temperature related deaths / Total population) × 100,000) / 21

This represents the average number of deaths per 100,000 people per year associated with the above ICD-10 codes.
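The formula can be written as a small function. This sketch applies it exactly as stated, with the divisor of 21 taken from the formula above; the function name and the example inputs are invented, and `total_population` is whatever denominator the formula's "Total population" refers to.

```python
def average_annual_rate(total_deaths, total_population, n_years=21, per=100_000):
    """Average annual mortality rate per `per` people, following the
    formula above: (deaths / population * per) / n_years."""
    return total_deaths / total_population * per / n_years


# Invented example: 2,100 deaths in a population of 1,000,000.
example_rate = average_annual_rate(2_100, 1_000_000)
```

With these made-up inputs the period rate is 210 per 100,000, which averages to 10 per 100,000 per year.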

Weather data were obtained from the PRISM Climate Group, which provides daily minimum, mean, and maximum temperature data for each state.

The number of hot and cold days per state was averaged across the 21-year period.
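Counting and averaging the threshold days might look like the sketch below. It uses the 0°C and 32°C thresholds mentioned earlier in the project; the record layout of (year, daily minimum, daily maximum) and the sample values are assumptions, not the PRISM file format.

```python
from collections import defaultdict

FREEZING_C = 0.0  # a day is "cold" when its minimum is below this
HOT_C = 32.0      # a day is "hot" when its maximum is above this


def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def average_days_per_year(daily):
    """daily: iterable of (year, tmin_c, tmax_c) records for one state.
    Returns the mean number of cold and hot days per year."""
    counts = defaultdict(lambda: {"cold": 0, "hot": 0})
    for year, tmin_c, tmax_c in daily:
        counts[year]["cold"] += tmin_c < FREEZING_C
        counts[year]["hot"] += tmax_c > HOT_C
    n_years = len(counts)
    return {kind: sum(c[kind] for c in counts.values()) / n_years
            for kind in ("cold", "hot")}


# Invented records: three days in 1999, two in 2000.
sample = [(1999, -5, 2), (1999, 1, 33), (1999, 10, 20),
          (2000, -1, 35), (2000, -3, 1)]
averages = average_days_per_year(sample)
```

A single day can count as both cold and hot if its minimum is below freezing and its maximum is above 32°C, as the first 2000 record does here.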

Finally, poverty percentages were derived from datasets published by the U.S. Census Bureau, representing the average poverty rate per state across the same time span.

Analysis

This first plot is a simple bar chart comparing the average number of days below 0 degrees Celsius (32 degrees Fahrenheit) and above 32 degrees Celsius (89.6 degrees Fahrenheit) for each state from 1999-2020. Cold days appear to be much more frequent than hot days.

This bar chart represents the average mortality rate for cold-related and heat-related deaths by state from 1999-2020. Cold mortality rates appear to be greater in most states, with a select few exceptions such as Arizona and Nevada.

These scatterplots check whether there is any correlation between the mortality rates and the poverty percent for each state. Cold mortality rates seem to have a negative correlation with poverty, while heat mortality rates have a positive one. This is the opposite of what we would have thought, but it may suggest that people in poverty have more access to heated infrastructure during cold months.
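The direction of each relationship can be quantified with a Pearson correlation coefficient. The sketch below implements it directly; the poverty and mortality numbers are invented to mimic the pattern described and are not the project's data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Invented state-level values, for illustration only.
poverty_pct = [9, 12, 15, 18]
heat_rate = [0.2, 0.4, 0.5, 0.9]
cold_rate = [2.5, 2.0, 1.8, 1.1]

r_heat = pearson_r(poverty_pct, heat_rate)  # positive in this toy data
r_cold = pearson_r(poverty_pct, cold_rate)  # negative in this toy data
```

A coefficient near 0 would indicate little linear relationship, so reporting the value alongside each scatterplot shows how strong the trend is as well as its direction.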

These choropleths show which ten-year age group has the most cold-related and heat-related deaths in each state. For both maps it seems that people 45 and older are most affected.

Note: Alaska and Hawaii are not to scale

This choropleth shows the difference between cold and heat mortality rates for each state. More states are colored blue than red, which tells us that cold mortality rates are higher than heat mortality rates in most states.

| Variable | Moran's I | p-value |
| --- | --- | --- |
| Death Rate Difference | 0.218 | 0.0120 |
| Cold Mortality Rate | 0.193 | 0.0150 |
| Heat Mortality Rate | 0.172 | 0.0310 |

Moran’s I measures spatial autocorrelation, indicating whether similar values cluster geographically. The difference in mortality rates had a Moran’s I of 0.218 (p = 0.012), cold mortality 0.193 (p = 0.015), and heat mortality 0.172 (p = 0.031), all showing statistically significant positive spatial clustering. This suggests that states with similar mortality rates tend to be near each other rather than randomly distributed.

| Type of Mortality | Rate (per 100,000) | Year/Period |
| --- | --- | --- |
| Drug Overdose | 32.4 | 2021 |
| Homicide | 5.9 | 2023 |
| Cold-related Mortality | 1.89 | Average 1999–2020 |
| Heat-related Mortality | 0.61 | Average 1999–2020 |

This table provides a comparison of mortality rates in the United States for selected causes. Cold-related deaths (1.89 per 100,000) occur at a higher rate than heat-related deaths (0.61 per 100,000), but both are substantially lower than deaths due to homicide (5.9 per 100,000) and drug overdose (32.4 per 100,000).

Conclusion

The first plot, a bar chart showing cold and hot days by state, indicates that most states experience more cold days (blue) than hot days (red). This may help explain why cold-related mortality is generally higher.

The second plot, a bar chart comparing cold (blue) and heat (red) mortality rates by state, reinforces this trend. With the exception of Arizona and Nevada, cold related mortality rates are consistently higher than heat related rates.

The third plot, a scatterplot comparing poverty percentage to mortality rates, provides insight into potential socioeconomic effects. Cold mortality shows a slight negative correlation with poverty, while heat mortality shows a slightly stronger positive correlation. This may suggest that people in poverty have better access to heating during cold periods but limited access to air conditioning during heat waves, though further study would be needed to confirm this.

The fourth plot, a pair of choropleths showing the most affected age groups for cold and heat related deaths, highlights that different age groups are more vulnerable depending on temperature extremes.

The fifth and final plot, a choropleth of the difference in mortality rates, provides the strongest support for our hypothesis. Positive values (blue) indicate higher cold related mortality, while negative values (red) indicate higher heat related mortality. Most states are shaded light to moderate blue, showing that cold-related deaths are generally higher. Alaska exhibits the largest difference, whereas Arizona and Nevada have higher heat-related mortality.

Spatial analysis using Moran’s I autocorrelation confirms that states with similar mortality rates are geographically clustered rather than randomly distributed, with all Moran’s I values positive and p-values < 0.05.

Finally, the average mortality rate from 1999 to 2020 was 1.89 per 100,000 for cold-related deaths and 0.61 per 100,000 for heat-related deaths, which supports our hypothesis that cold-related mortality is substantially higher across the U.S. over this period. However, both rates are small compared to other mortality rates, such as drug overdose.

Sources